Measuring Short Text Reuse for the Urdu Language



Similar resources

COUNTER: corpus of Urdu news text reuse

Text reuse is the act of borrowing text from existing documents to create new texts. Freely available and easily accessible large online repositories are not only making reuse of text more common in society but also harder to detect. A major hindrance in the development and evaluation of existing/new mono-lingual text reuse detection methods, especially for South Asian languages, is the unavail...


METER: MEasuring TExt Reuse

In this paper we present results from the METER (MEasuring TExt Reuse) project whose aim is to explore issues pertaining to text reuse and derivation, especially in the context of newspapers using newswire sources. Although the reuse of text by journalists has been studied in linguistics, we are not aware of any investigation using existing computational methods for this particular task. We inv...


An Efficient Method for Urdu Language Text Search in Image Based Urdu Text

This paper describes an efficient method for Urdu text search in computer-generated and handwritten scanned images. Efficient text search technology is necessary because the number of documents handled increases every day. This method is unique and simple in the sense that no features are extracted. The proposed method is script independent. The input image is directly matched with a set of prototype c...


Multi Language Text Editor for Burushaski and Urdu through Unicode

This paper introduces Burushaski, an isolated and unique ancient language spoken in Hunza, Nagar, Yasin and parts of Gilgit in the Northern Areas of Pakistan. It explains the working mechanism of the Multi Language Text Editor for Urdu and Burushaski. It was developed using the ISO/IEC 10646 Unicode standards for Urdu and Burushaski OpenType fonts. It gives an ample opportunity to this regio...


Measuring Text Reuse in a Journalistic Domain

This paper describes a general framework for measuring text reuse. This term is used to describe how content from one or more known sources can be reused either verbatim (word-for-word copy) or otherwise rewritten depending upon factors influencing the creation of a new document. These may include reduction/increase in length, change of style, simplification of content, shif...



Journal

Journal title: IEEE Access

Year: 2018

ISSN: 2169-3536

DOI: 10.1109/access.2017.2776842